The reader is a procedure available in the standard environment as
the value of the variable READ-OBJECT. Conceptually, the reader
coerces a stream of characters (external representations) into a stream of
objects (internal representations) by a process
known as parsing.
The reader works as follows:
Any whitespace characters (space, tab, newline, carriage return, line
feed, or form feed) are read and ignored. A non-whitespace character is
obtained; call it c.
If c is a read-macro character, the reader invokes a specialist routine
to handle a syntactic construct introduced by the read-macro character.
If c is not a read-macro character, then characters are read and
saved until a delimiter character is read. A delimiter character is
either a whitespace character, or one of the following: ( (left
parenthesis), ) (right parenthesis), [, ], {,
}, or ; (semicolon). If the sequence of characters beginning
with c and going up to but not including the delimiter is parsable
as a number, then the sequence is converted to a number, which is
returned.
Otherwise the sequence is converted to a symbol.
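The non-read-macro path described above can be sketched in a few lines of Python. This is only an illustration, not the actual reader: the name `read_atom`, the `("symbol", name)` tuple used to represent symbols, the restriction to integers, and the choice to leave the delimiter unconsumed are all assumptions made for the example.

```python
WHITESPACE = " \t\n\r\f"
DELIMITERS = WHITESPACE + "()[]{};"

def read_atom(text, pos=0):
    """Read one number or symbol from text, starting at index pos.

    Returns (object, next_pos). Read macros are not handled in this sketch.
    """
    # Whitespace characters are read and ignored.
    while pos < len(text) and text[pos] in WHITESPACE:
        pos += 1
    if pos == len(text):
        raise EOFError("end of input before any object")
    # Save characters up to, but not including, the next delimiter.
    start = pos
    while pos < len(text) and text[pos] not in DELIMITERS:
        pos += 1
    token = text[start:pos]
    if token == "":
        raise ValueError("read-macro character; not handled in this sketch")
    try:
        return int(token), pos                 # parsable as a number
    except ValueError:
        return ("symbol", token.upper()), pos  # otherwise a symbol
```

Calling `read_atom` repeatedly with the returned position walks through successive atoms in the input. Upper-casing the symbol name follows the case conversion that the text implies for ordinary constituent characters.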
The escape character, backslash (\), may be used within a
run of constituent characters to include unusual characters in a
symbol's print name. In this case, the escaped character (i.e. the character
following the escape character) is treated as if it were a constituent
character, and is not converted to upper case if it is a lower case
letter.
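The effect of the escape character on a symbol's print name can be sketched as follows. This is a hedged Python illustration rather than the reader's real code: `read_symbol_name` is a hypothetical helper, and it assumes that unescaped constituent characters are upper-cased while an escaped character is kept exactly as written, even if it would otherwise be a delimiter.

```python
DELIMITERS = " \t\n\r\f()[]{};"

def read_symbol_name(text, pos=0):
    """Return (print_name, next_pos) for the symbol starting at text[pos]."""
    name = []
    while pos < len(text) and text[pos] not in DELIMITERS:
        if text[pos] == "\\":
            if pos + 1 == len(text):
                raise EOFError("escape character at end of input")
            name.append(text[pos + 1])      # escaped: kept verbatim
            pos += 2
        else:
            name.append(text[pos].upper())  # ordinary constituent: upper-cased
            pos += 1
    return "".join(name), pos
```

Under these assumptions, the input `foo\bar` produces the print name `FOObAR`, and `a\ b` produces a three-character name containing a space.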
The following are standard read-macro characters:
- "
- Doublequote: introduces a string. Characters are read until another doublequote
character is found which does not immediately follow a backslash (\),
and a string is returned. Within a string, backslash acts as an escape
character, so that doublequotes and backslashes may appear in strings.
- '
- Quote: 'object reads the same as (QUOTE object).
- (
- Left parenthesis: begins a list.
- )
- Right parenthesis: ends a list or vector, and is illegal in other
contexts.
- `
- Quasiquote: see section .
- ,
- Comma: this is part of the backquote syntax.
- @
- At sign: this is part of the backquote syntax.
- ;
- Semicolon: introduces a comment. Characters are read and discarded
until a newline is encountered, at which point the parsing process starts over.
- #
- Sharp-sign: another dispatch to a specialist routine is
performed according to the character following the #.
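Two of the read-macro routines above, doublequote and semicolon, are simple enough to sketch directly. The Python below is an illustration under stated assumptions: the function names are hypothetical, input is a plain string indexed by position, and an unterminated string is reported with `EOFError` (the text does not say how the real reader handles that case).

```python
def read_string(text, pos):
    """text[pos] is the opening doublequote; return (string, next_pos)."""
    out = []
    pos += 1
    while pos < len(text):
        ch = text[pos]
        if ch == "\\":
            # Backslash escapes the next character, whatever it is.
            if pos + 1 == len(text):
                break
            out.append(text[pos + 1])
            pos += 2
        elif ch == '"':
            return "".join(out), pos + 1    # closing doublequote
        else:
            out.append(ch)
            pos += 1
    raise EOFError("unterminated string")

def skip_comment(text, pos):
    """text[pos] is a semicolon; return the index just past the newline."""
    end = text.find("\n", pos)
    return len(text) if end == -1 else end + 1
```

After `skip_comment` returns, a full reader would start parsing again from the returned position, which matches the "starts over" behavior described for semicolon.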
Standard sharp-sign forms:
- #\
- Character syntax. See section .
- #x
- Hexadecimal input. An integer following the #x
is read in base 16.
- #o
- Octal input. An integer following the #o is read in base 8.
- #b
- Binary input. An integer following the #b is read in base 2.
- #(... )
- Vector. The elements of a vector are read between the parentheses,
and the vector is returned.
- #[... ]
- This syntax is used for certain kinds of re-readable objects.
It also provides an alternate syntax for characters and symbols.
The brackets enclose a sequence of objects; the first should be a symbol
which keys the type of the resulting object, e.g. CHAR or SYMBOL.
This syntax is used by the printer when necessary.
- #{... }
- This is the syntax used by the printer for objects which
have no reader syntax. When the reader encounters the sequence #{
it signals an error.
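The second-level dispatch on the character after # can be sketched for the radix prefixes and for #{. This Python fragment is a rough illustration only: `read_sharp` is a hypothetical name, only lower-case `x`, `o`, and `b` are recognized, the other sharp-sign forms are left out, and `SyntaxError` stands in for whatever error the real reader signals.

```python
DELIMITERS = " \t\n\r\f()[]{};"
RADIX = {"x": 16, "o": 8, "b": 2}

def read_sharp(text, pos):
    """text[pos] is '#'; dispatch on the character that follows it.

    Returns (object, next_pos).
    """
    if pos + 1 >= len(text):
        raise EOFError("nothing follows the sharp-sign")
    key = text[pos + 1]
    if key == "{":
        # Printed form of an object that has no reader syntax.
        raise SyntaxError("#{ cannot be read back in")
    if key in RADIX:
        start = end = pos + 2
        while end < len(text) and text[end] not in DELIMITERS:
            end += 1
        # int() raises ValueError if the digits do not fit the base.
        return int(text[start:end], RADIX[key]), end
    raise NotImplementedError("sharp-sign form not covered by this sketch")
```

Keeping the radix table separate from the dispatch logic means a new prefix would only need a new table entry, which mirrors the way the text lists the three radix forms side by side.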